University of British Columbia & Simon Fraser University – The Bricolage

VAST 2007 Contest Submission

 

Authors and Affiliations:

The following authors are listed in alphabetical order by last name:

William Chao

University of British Columbia

will@ubcviscog.com

Daniel Ha

School of Interactive Arts & Technology (SIAT), Simon Fraser University

dha1@sfu.ca

Kevin Ho

University of British Columbia

kevin@ubcviscog.com

Linda Kaastra

Media and Graphics Interdisciplinary Centre (MAGIC), University of British Columbia

lkaastra@interchange.ubc.ca

Minjung Kim

University of British Columbia

minjung@ubcviscog.com

Andrew Wade

University of British Columbia

andrewwade@ubcviscog.com

This project was supported by the Natural Sciences and Engineering Research Council of Canada.

 

Student team: [ X ] YES  [   ] NO 
If you answered yes, name the faculty who agreed to be your sponsor:     Brian Fisher, bfisher@sfu.ca

Tool(s):

For the VAST 2007 contest problem, our team used a variety of commercial and open source tools to support our analytic task. We did not pre-select a specific toolset to support this task, but rather evaluated and selected both specialized and generic tools along the way for answering emergent questions, generating plausible hypotheses, and supporting evidence-based argumentation. We refer to this as the bricolage approach: the application of multiple analytic methods by multiple analysts, using an assortment of specialized tools that could be combined to tackle a complex problem space.

We used a large number of different tools, consisting primarily of those that were both easy to learn and readily available. Of these, nine tools in particular contributed significantly to our overall understanding of the problem and are summarized in the chart below. These tools are described in further detail throughout Section 5, Visuals and Description of the Analytical Process. The links lead to the websites of the software.

 

Summary of Tools

Tool

Developer

Description

Release Used

Open Source?

Text and Qualitative Data Analysis

ATLAS.ti

ATLAS.ti Scientific Software Development GmbH

Qualitative data analysis tool capable of automatically coding a data set.

5.2.9

No

Stanford POSTagger

Stanford Natural Language Processing Group

Identifies and tags every word with its part-of-speech (e.g., noun, verb). Used to identify proper nouns, and hence, names of people.

2006-05-21

Yes

TextSTAT

Dutch Linguistics, Free University of Berlin

Text analysis tool, used to search and identify occurrences of important names or words.

2.7

Yes

Search Utility

Organizers of VAST 2006 contest

Text searching tool provided with the pre-processed VAST 2006 data set. Used to search and identify occurrences of important names or words.

VAST 2006

No

 

Data Sharing and Visualizations

Dia

Multiple authors; primary author is Alexander Larsson

Diagramming tool.

0.96.1-7

Yes

FreeMind

Multiple authors; primary contributors are Jörg Müller, Daniel Polansky, Petr Novak, Christian Foltin, and Dimitry Polivaev

Mind-mapping software. Used to take notes.

0.8.0

Yes

Google Documents & Spreadsheets

Google

Web applications that mimic the capabilities of Microsoft Word and Excel, but with an additional social and collaborative dimension. Used to collect and discuss notes as a group.

2007

No

GraphViz

AT&T

Diagramming tool. Automatically generates relationship diagrams based on simple, tab-delimited text.

2.12

Yes

Microsoft Excel

Microsoft

Spreadsheet software.

2003

No

IHMC CmapTools

IHMC

Diagramming tool. Automatically generates relationship diagrams based on simple, tab-delimited text.

4.10

No

Timeline Maker

Progeny Software, Inc.

Used to make timelines.

Trial version

No

 

Data set used:   [ X ] RAW DATA SET     [   ] PRE-PROCESSED  SET

 

TOC:  Who - What - Where - Debriefing - Process - Video

 


 

1. WHO: Who are the players engaging in questionable activities in the plot(s)? When appropriate, specify the organization they are associated with.

Name

Associated organization

Involved in illegal activities? (Yes/No)

Involved in terrorist activities? (Yes/No)

Most relevant source files (5 MAX)  

Abu Hassan

Global Ways,

Professor Assan and His Amazing Animals

Yes

No

Week-of-Mon-20031215-1.txt_91,

Week-of-Mon-20040301-1.txt_75,

ImportPermitsv3 BEST WORKING COPY

Catherine Carnes

SPOMA

No

No

Chinchilla Dreamin’,

Week-of-Mon-20030526-2.txt_57,

Week-of-Mon-20030818.txt_23

Cesar Gil

None

Yes

Yes

Chinchilla Dreamin’,

Week-of-Mon-20030609.txt_4,

Week-of-Mon-20040705.txt_86

Faron Gardner

Animal Justice League

Yes

Yes 

Chinchilla Dreamin’,

Week-of-Mon-20030602-1.txt_66,

Week-of-Mon-20030818.txt_23

Luella Vedric

SPOMA

Yes

No

Week-of-Mon-20030526-2.txt_57,

Week-of-Mon-20031013.txt_4,

Week-of-Mon-20040119-1.txt_98,

Week-of-Mon-20040412-2.txt_13,

Week-of-Mon-20040705.txt_83

Madhi Kim

Global Ways

Yes

No

Week-of-Mon-20040308.txt_109,

Week-of-Mon-20040412-2.txt_13

Navarro Mercurio

Global Ways

Yes

No

meeting, Tropical Fish Importers

r’Bear

Shravaana / Shraavana

No

No

Week-of-Mon-20030609.txt_7,

Week-of-Mon-20040412-2.txt_13,

Week-of-Mon-20040628.txt_61,

Week-of-Mon-20040614.txt_94

Rosalind Baptista

Unknown

Yes

No

Chinchilla Dreamin’, hunt8, meeting

 


 

2. WHEN/WHAT: What events occurred during this time frame that are most relevant to the plot(s)?

 

Date
Can be a range

Event description

Most relevant source files

(5 Max)

1

July 18, 2003

Rosalind Baptista is seen poaching chinchillas.

hunt8

2

August 15, 2003

Cesar Gil becomes a chinchilla farmer.

Chinchilla Dreamin’

3

September 1, 2003

Cesar Gil announces that Gil Breeders is selling chinchillas at West LA farmers market.

Week-of-Mon-20030901-1.txt_36

4

September 22, 2003

Global Ways advertises their fish import service, highlighting their low rates of death-on-arrival (DOA).

Week-of-Mon-20030922.txt_28

5

October 27, 2003

Letters to editor complaining about poor Global Ways shipments are published. Fish shipping bags are noted to be covered in noxious substance that causes numbness of hands. Global Ways blames an inexperienced packer in South America.

cocaine hydro,

Transport of Live Fish,

Week-of-Mon-20031027.txt_57

6

December 15, 2003

A letter to CITES, urging the shut-down of Assan Circus, is published. Abu Hassan, owner of the circus, is accused of smuggling chimps and parrots, as well as mistreating animals.

Week-of-Mon-20031215-1.txt_91

7

January 6, 2004

Fish and Wildlife Services issues an advisory to ornamental fish merchants about contaminated fish shipping packages. Several fish import companies located in Miami are named as possible sources of contaminated shipments, including Global Ways.

Week-of-Mon-20040105-1.txt_58

8

January 20, 2004

The eighth annual SPOMA dinner is hosted by Luella Vedric. The performance by r’Bear, the famed rapper, is not well received, despite his donation of $80,000.

Week-of-Mon-20040119-1.txt_98

9

March 2, 2004

CITES-issued confiscation of Abu Hassan’s circus animals is reported. Hassan is presumed to have fled the country.

Week-of-Mon-20040301-1.txt_75

10

March 2, 2004

Cesar Gil posts a Chinsurrection comic to his blog, depicting a chinchilla becoming infected with an unnamed disease.

Chinchilla Dreamin’

11

March 13, 2004

Madhi Kim, the CEO of Global Ways, visits r’Bear’s wildlife preservation ranch, Shravaana. Madhi Kim is reported to own a canned hunting ranch, Wild Things.

Week-of-Mon-20040308.txt_109

12

April, 2004

Navarro Mercurio (“MN”) and Rosalind Baptista (“RB”) are photographed meeting in New Orleans.

Meeting

13

April 18, 2004

Global Ways hosts Nights of Champagne and Tropical Fish, a celebration of wine and fish. Madhi Kim invites Luella Vedric and r’Bear (spelled “r’Bert” in the article).

Week-of-Mon-20040412-2.txt_13

14

June 2, 2004

Cesar Gil posts a Chinsurrection comic to his blog, depicting a mass spread of illness through chinchillas.

Chinchilla Dreamin’

15

June 20, 2004

r’Bear announces the arrival of over 500 new animals to Shravaana, including some short-tailed chinchillas.

Week-of-Mon-20040614.txt_94

16

June 30, 2004

Cesar Gil posts a Chinsurrection comic to his blog, depicting the anticipated delivery of sick chinchillas by “Senorita Baptista.”

Chinchilla Dreamin’

17

July 1, 2004

r’Bear is admitted to the hospital with monkeypox-like symptoms.

Week-of-Mon-20040628.txt_61

18

July 7, 2004

Seven people in the LA region are reported to be seriously ill with monkeypox. This is the second monkeypox outbreak in the US.

Week-of-Mon-20040705.txt_83

19

July 24, 2004

Two people die from monkeypox. As a result of the outbreak, international animal transportation becomes more tightly regulated. Cesar Gil is wanted on suspicion of involvement in the outbreak, but is presumed to have fled the country.

Week-of-Mon-20040705.txt_86

 


 

3. WHERE: What locations are most relevant to the plot(s)?

 

Location

Description

Most relevant source files

(5 Max)

1

Southern California (Los Angeles and San Diego)

Site of monkeypox outbreak in July 2004. Cesar Gil’s chinchilla farm, Gil Breeders, as well as r’Bear’s wildlife preservation ranch, Shravaana, are located in the Southern California region.

Week-of-Mon-20040705.txt_83,

Week-of-Mon-20040628.txt_61,

Week-of-Mon-20040614.txt_94

2

Miami

Location of a Global Ways branch, managed by Navarro Mercurio. Tropical fish imported through this branch in autumn 2003 suffer from high death-on-arrival rates, and are shipped in packaging covered in a noxious substance.

Week-of-Mon-20031027.txt_57,

Week-of-Mon-20040105-1.txt_58,

Tropical Fish Importers

3

Chile

Native habitat of chinchillas, and hence, the site of chinchilla poaching. Rosalind Baptista is photographed hunting here.

Chinchilla Dreamin’, hunt8

4

New Orleans

The location of the meeting between Navarro Mercurio (“MN”) and Rosalind Baptista (“RB”).

Meeting

5

New York

Location of a Global Ways branch. Permits for Abu Hassan originate from here.

ImportPermitsv3 BEST WORKING COPY

 


 

4. DEBRIEFING

In the spring of 2003, chinchillas gained popularity as the new “fad pets” in the US: animals, previously uncommon in household settings, that suddenly become fashionable as pets. Consequently, many animal rights activists became concerned that the rising popularity of these pets might also mean a rise in incidents of animal neglect and other forms of abuse.

In particular, the fad offended Cesar Gil, a biologist in the Los Angeles region, who worried that the increased demand for chinchillas would result in increased rates of chinchilla poaching in South America. Gil thus devised a scheme, the “Chinsurrection,” designed to reduce the popularity of chinchillas by infecting them with monkeypox, thereby making them carriers of a disease that is potentially fatal to humans. In the summer of 2003, Gil opened a chinchilla farm, Gil Breeders, where he began raising chinchillas for use in his plot while establishing himself as a trustworthy chinchilla vendor in LA. The progress of his scheme is chronicled in a comic strip on his blog, Chinchilla Dreamin’.

The monkeypox plot culminated in July of 2004, when seven people were reported to be ill with monkeypox, including the megastar rapper, r’Bear. By July 24, two people had died from monkeypox. Meanwhile, Cesar Gil was nowhere to be found, presumably having fled the country.

The distribution of monkeypox-infected chinchillas can be tied to a privately traded company called Global Ways. On the surface, Global Ways appears to be an import-export company that specializes in the import of rare and exotic tropical fish. However, Global Ways is also involved in animal and cocaine smuggling operations, especially from South America and Africa. In particular, we recognize Madhi Kim, the CEO of Global Ways, and Navarro Mercurio, the office manager of the Global Ways Miami branch, to be closely involved in these illegal dealings.

Global Ways’ involvement with animal smuggling is closely linked with Abu Hassan, the owner of the circus “Professor Assan and His Amazing Animals.” Through this African circus, Hassan obtained exotic species of parrots and chimpanzees to be imported to the US. His notoriety, both in animal abuse and in smuggling, eventually led to a raid on his circus by the Convention on International Trade in Endangered Species (CITES) on March 2, 2004, in Zimbabwe. Many of his animals were confiscated at that time, although Hassan himself was missing, presumably having gone into hiding.

A series of tropical fish shipments through Miami suggests that Global Ways smuggles cocaine from South America. In the fall of 2003, reports of poor-quality fish transports by Global Ways, with fatality rates as high as 80%, were publicized. Furthermore, the packages were covered in an unidentified toxin, which caused some workers to require emergency medical attention. According to the support material provided, live fish are transported in plastic bags, packaged in a Styrofoam case lined with insulation material. We suspect that the insulation material in some of the fish shipments may have been cocaine. This hypothesis is supported by the symptoms exhibited by fish handlers, which are suspiciously similar to those of acute cocaine poisoning through skin contact and inhalation: numbness and tingling of hands, dilated pupils, difficulty breathing, and euphoria. Global Ways blamed an inexperienced fish handler in South America for this incident. Later, suspicion of Global Ways became somewhat diffused when nine additional fish importers were named as potential sources of the contaminated fish.

Since the suspected cocaine was trafficked through Miami, it is very likely that Navarro Mercurio is aware of, and is directing, these illegal activities. We suspect that Mercurio is also involved with Global Ways’ animal smuggling ring, and propose that Mercurio may be “M.N.,” who was photographed meeting with “R.B.” in New Orleans during April of 2004. “R.B.” in the photo is likely Rosalind Baptista, a known chinchilla poacher, who was photographed illegally hunting chinchillas in Choapa Valley, Chile. Cesar Gil appears to be aware of Baptista’s role as an illegal distributor of chinchillas to retailers, as demonstrated by his cryptic remarks on the Chinsurrection strip posted on June 30, 2004: “Senorita Baptista delivers [the chinchillas] t’morra!” It should be noted that the chinchilla depicted in this strip appears to be a carrier of disease—likely monkeypox—which causes the pet owner in the comic to fall ill.

The hypothesis that Navarro Mercurio and Rosalind Baptista are M.N. and R.B. is supported by r’Bear’s strange illness in July of 2004, likely a monkeypox infection from sick chinchillas. We know that Madhi Kim and r’Bear are on cordial terms. For instance, on March 13, 2004, Kim was invited to r’Bear’s wildlife preservation ranch, Shravaana. In mid-April 2004, r’Bear was invited to the Global Ways Nights of Champagne and Tropical Fish as a “special guest” of Kim. Furthermore, they have a mutual friend, Luella Vedric, who shares their interest in uncommon animals. Therefore, we can postulate that some of the 500 animals r’Bear acquired in June of 2004, including the short-tailed chinchillas, were supplied by Madhi Kim on behalf of Global Ways. This, combined with Rosalind Baptista’s infected chinchillas, points to Navarro Mercurio as the likely intermediary in the chinchilla connection from Cesar Gil to Global Ways, and subsequently to r’Bear. Unfortunately, how Gil knows of Baptista is unclear, which raises several questions about the nature of their connection: Does she supply him with chinchillas for his farm? Does he supply her with monkeypox-infected chinchillas? If so, is she aware of the infection? Who is his informant? This trail of infected chinchillas is studded with highly suspicious characters, and warrants further investigation.

We also recommend an investigation of Luella Vedric, a socialite and an outspoken member of the Society for the Prevention of Mistreatment of Animals (SPOMA). On the surface, Vedric appears to be, in every way, an animal rights activist, even hosting the eighth annual SPOMA dinner in January of 2004. She is also a long-time friend of Catherine “Collie” Carnes, the spokesperson of SPOMA, and was reported to be helping track Abu Hassan’s circus to stop animal cruelty. Yet, simultaneously, she openly associates with Madhi Kim, who owns a canned hunting ranch and trades with Abu Hassan, the very man she was reportedly helping to track. These facts place Vedric in an awkward position between innocence and guilt: Is Vedric genuinely attempting to stop animal abuse by getting information about Abu Hassan through Kim? Or is she motivated by something else?

 


 

5. VISUALS and Description of ANALYTICAL PROCESS

Our analytical process can be described in four major stages: information generation, schematization, argumentation and schema shifting, and decision-making. Our problem-solving approach gradually shifted from independent to collaborative work as we progressed through the stages. The shift was deliberate, and the tools chosen along the way reflect this change.

It should be noted that, for the purposes of discussion, we describe the analytical process in neat, segmented phases; the actual problem solving process was neither discrete, nor always linear.

1. Information Generation

 a. Information Discovery and Foraging

The process of analysis began by familiarizing ourselves with the provided data set and by discovering categories of information (such as names, places, and events) in the data. Independent information discovery was encouraged at this phase, in order to mitigate the potential for groupthink: a faulty, conforming style of group analysis that can lead to poor decision making. We allowed ourselves to explore the data space freely with few constraints, and discouraged each other from sharing individual findings until all team members had a chance to complete at least a single iteration of the information discovery stage. Our intent was to promote the generation of a diverse set of hypotheses. For this reason, most tools and techniques used for information discovery were selected primarily for their ability to support rapid entity extraction.

A preliminary timeline, for instance, was created by placing news articles and images in a common file folder, and re-naming them to reflect their date of creation [Fig.1]. This enabled us to get an immediate sense of the data space, allowing us to approximate both the total quantity of information and the distribution of information through time.

 

Figure 1. Windows Explorer. News articles and images are pooled together in a single folder, forming a rough timeline of events.

Figure 2. Appended news articles in a text editor. The location of the scroll bar approximates our position in the overall timeline.

 

Given the relatively small size of the data set, some of the team members opted to glance through most of the corpus to get a good sense of the themes. We believed that, by becoming familiar with some of the storylines present, we would be able to recognize and organize information around thematic categories when the information was examined in more detail. Rather than read each file individually, we appended all of the news articles together in a single text file, then read it in a simple text editor. This allowed us to glance at the content quickly while scrolling down, as well as use the location of the scroll bar to estimate our position in the overall timeline [Fig.2]. We also tracked our advancement in an activity log, forcing ourselves to become conscious of our progress in terms of the competition deadline.
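
The date-ordered folder and the single appended reading file described above were prepared by hand. A minimal Python sketch of an equivalent script follows. It assumes that each file name carries a date in YYYYMMDD form (as the source names cited in this report do, e.g., Week-of-Mon-20030526-2.txt_57); the function names and the separator line are hypothetical and are not part of our actual workflow.

```python
import re
from datetime import date

# Assumed naming pattern, modelled on names such as
# "Week-of-Mon-20030526-2.txt_57"; the raw files may be named differently.
DATE_IN_NAME = re.compile(r"(\d{4})(\d{2})(\d{2})")


def date_from_name(name):
    """Return the first YYYYMMDD date embedded in a file name, or None."""
    match = DATE_IN_NAME.search(name)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def build_reading_file(articles):
    """Join articles in date order. `articles` maps file name -> text.

    Files without a recognizable date are left out of the timeline.
    """
    dated = [(date_from_name(name), name) for name in articles]
    dated = [(d, name) for d, name in dated if d is not None]
    chunks = []
    for d, name in sorted(dated):
        # A visible separator makes article boundaries easy to spot while scrolling.
        chunks.append(f"===== {d.isoformat()}  {name} =====\n{articles[name]}\n")
    return "\n".join(chunks)
```

Because the output is ordered by date, the scroll-bar position in any text editor remains a rough proxy for position in the timeline.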

Generation of entities was accomplished, in part, with the Stanford Part-Of-Speech Tagger (POSTagger), open-source software developed by the Stanford Natural Language Processing Group. It parses a text file and appends a tag to every word present, thereby identifying its part of speech (e.g., noun, verb) [Fig.3]. Using the POSTagger, we were able to isolate most of the proper nouns, and hence the names of potential interest, present in the corpus.

Figure 3. The Stanford POSTagger. “Vedric” has been recognized as a proper noun. We cross-referenced this name with a list of word frequencies, and concluded that Vedric may be an important character.

We found that word frequency was a good heuristic by which to discard unwanted entities. By limiting ourselves to words that occur only two to five times in the corpus, we were able to extract some names that were neither extremely common nor extremely rare. Words that occur hundreds of times (for instance, PETA) were deemed to be false leads. Similarly, words that occur only once in the entire corpus (such as LaRae) were also assumed to be irrelevant entities. It should be noted that words were not permanently discarded in this stage; the list of names obtained from the POSTagger was merely considered a starting point for conducting specific searches of the data set, with the understanding that previously discarded words may, in fact, be important entities that need to be re-introduced to our list.
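
The frequency heuristic can be sketched in a few lines of Python. This is an illustration rather than the procedure we ran: it assumes the tagger output has been saved as plain text in which each token is written as word_TAG, and that proper nouns carry the Penn Treebank tags NNP or NNPS. The separator and the function names are assumptions.

```python
from collections import Counter

PROPER_NOUN_TAGS = {"NNP", "NNPS"}  # Penn Treebank tags for proper nouns


def proper_nouns(tagged_text, sep="_"):
    """Yield every word whose tag marks it as a proper noun.

    Assumes tokens look like "Vedric_NNP"; `sep` is an assumption and
    should be changed if the tagger output uses another separator.
    """
    for token in tagged_text.split():
        word, found, tag = token.rpartition(sep)
        if found and tag in PROPER_NOUN_TAGS:
            yield word


def candidate_names(tagged_text, low=2, high=5, sep="_"):
    """Keep proper nouns that occur between `low` and `high` times."""
    counts = Counter(proper_nouns(tagged_text, sep))
    return {word: n for word, n in counts.items() if low <= n <= high}
```

The band of two to five occurrences is exposed as parameters because the discarded words are only set aside, not deleted; widening the band re-introduces them.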

The POSTagger was used in conjunction with other text analysis tools, such as ATLAS.ti, a commercial tool used in the social sciences community to support the analysis of large corpora. It was used by one of the analysts in the early stages of information discovery to find and extract entities. ATLAS.ti facilitates fast searches with its auto-coding tool, which creates custom search scripts using GREP regular expressions, then runs them against a large number of documents to codify entities of interest for later analysis [Fig.4a, 4b, 4c]. It also provides a visual relations editor that allows analysts to assign both known and hypothetical relations between entities, then output the result as a graphic file for group discussion and argumentation [Fig.4d].

Figure 4a. ATLAS.ti’s auto-coding feature.

Figure 4b. Entities created through auto-coding.

Figure 4c. Coded data.

Figure 4d. ATLAS.ti’s network view of entity relationships.
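
The idea behind regular-expression auto-coding can be illustrated outside ATLAS.ti. The Python sketch below is not ATLAS.ti's implementation or API; the code book, its patterns, and the function name are hypothetical, with entity names taken from this report.

```python
import re

# Hypothetical code book: code label -> regular expression.
CODE_BOOK = {
    "PERSON:Vedric": r"\bLuella\s+Vedric\b|\bVedric\b",
    "ORG:Global Ways": r"\bGlobal\s+Ways\b",
    "TOPIC:chinchilla": r"\bchinchillas?\b",
}


def auto_code(documents, code_book=CODE_BOOK):
    """Tag every match of every pattern in every document.

    `documents` maps document name -> text. Returns a list of
    (document, code, start, end, matched_text) tuples for later review.
    """
    compiled = {code: re.compile(p, re.IGNORECASE) for code, p in code_book.items()}
    hits = []
    for name, text in documents.items():
        for code, pattern in compiled.items():
            for m in pattern.finditer(text):
                hits.append((name, code, m.start(), m.end(), m.group(0)))
    return hits
```

Keeping the character offsets with each hit makes it possible to jump back to the coded passage in the original article.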

 

Other text analysis tools include TextSTAT, which is open-source software, and Search Utility, a program provided with the VAST 2006 data set.

Searching for strings in TextSTAT returns results in its concordance view, which arranges search terms in the context of the text surrounding them [Fig.5a]. Double-clicking on a search result opens the citation view, which shows an even larger excerpt surrounding the search term [Fig.5b]. Figure 5a demonstrates an example search using the string “chinchilla.” The uniform, vertical alignment of search results in the middle of the screen allows a quick scan of the text snippets, and facilitates finding articles of interest (in this case, the article relating chinchillas to the monkeypox outbreak). Each line of the concordance view corresponds to a single occurrence of the search term; articles containing multiple occurrences of the search term, therefore, were listed multiple times.

 

Figure 5a. TextSTAT’s concordance view. A number of characters surrounding the search term are returned along with the search result, providing context.

Figure 5b. TextSTAT’s citation view. A  larger segment surrounding the search result is displayed.
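
A concordance (keyword-in-context) listing of this kind is straightforward to reproduce. The following Python sketch is a generic illustration, not TextSTAT's own code; the function name and the default context width are assumptions.

```python
import re


def concordance(text, term, width=30):
    """Return one line per occurrence of `term`, with the term aligned.

    Each line holds up to `width` characters of left context (right-aligned
    so that every hit starts in the same column), the hit, and up to
    `width` characters of right context.
    """
    flat = " ".join(text.split())  # collapse line breaks so a hit fits on one line
    lines = []
    for m in re.finditer(re.escape(term), flat, re.IGNORECASE):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        lines.append(f"{left:>{width}}{m.group(0)}{right}")
    return lines
```

Padding the left context to a fixed width is what produces the vertical alignment of search terms that makes the listing quick to scan.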

 

Unlike TextSTAT, Search Utility only returns one result per news article [Fig.6]. Search Utility, therefore, allowed us to easily approximate the number of articles related to the search string, which was difficult to accomplish with TextSTAT. The two tools were used in conjunction to complement each other’s features.

 

Figure 6. Search Utility only returns one result per article.
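
The difference between the two tools amounts to counting occurrences versus counting articles. A small Python sketch of both counts, assuming the articles are held in memory as a mapping from name to text (the function name is hypothetical):

```python
def occurrence_and_article_counts(articles, term):
    """Return (total occurrences, number of articles with at least one hit).

    The first number corresponds to one line per occurrence, as in a
    concordance view; the second to one result per article.
    """
    needle = term.lower()
    per_article = {name: text.lower().count(needle) for name, text in articles.items()}
    total = sum(per_article.values())
    matching = sum(1 for n in per_article.values() if n > 0)
    return total, matching
```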

 

Interesting results from the information discovery stage were vigilantly recorded, primarily in the form of daily logs that marked our progress and leads. Some of the analysts chose to depict their newest leads using FreeMind, open-source mind-mapping software, in effect visualizing the locally explored regions of the overall problem space [Fig.7a, 7b]. It provided an effective means of monitoring the information that had already been discovered and, subsequently, of identifying avenues for further research. Figure 7a shows that following up on PetSmart and the Animal Justice League (AJL) led to a connection between Cesar Gil and Faron Gardner. Whether or not PetSmart’s chinchillas are directly related to Cesar Gil is, however, unclear. Other leads resulted in dead ends (e.g., Tony Jones) or exploded into large terrorist stories that did not appear to be linked to the main plot (e.g., Chiron, trashing of a biology lab in Louisiana).

 

Figure 7a. An example of tracking leads using FreeMind. The above image demonstrates a follow-up on the raid on PetSmart by Faron Gardner and the Animal Justice League.

Figure 7b. Another example of tracking leads using FreeMind.

 

b. Information Pooling, Data Transformations and Integration

This stage marked the beginning of real collaborative work. Due to the independent nature of the previous phase, each analyst had an incomplete mental representation of the data. Differing preliminary hypotheses about the significance of entities resulted in the usage of different representations to communicate ideas. The tools selected in this stage, therefore, reflect individual mental representations of the data.

In order to facilitate the sharing of mental representations, we created an online wiki for uploading any information relevant to solving the puzzle. Later, it also served as an approximate written record of our earlier collaborative work. Other initial methods of pooling data include a relationship diagram of entities created using sticky notes [Fig. 8a, 8b], and a timeline of events using pen and paper [Fig. 9a, 9b].

 

Figure 8a. Cluster diagram created using sticky notes.

Figure 8b. Discussion over the cluster diagram.

 

Figure 9a. Timeline created using pen and paper.

Figure 9b. Discussion over the timeline.

 

Information was aggregated using Google Docs and Spreadsheets, which allow many users to collaborate synchronously on a single document [Fig.10a, 10b]. In order to avoid overwriting each other’s work, we reserved sections by highlighting the cells that we were currently working on.

 

Figure 10a. Database of entity relations in Google Spreadsheet. The fields recorded are as follows: object A, object B, the strength of the evidence that supports their relation, the description of their relation, the date of the evidence, and the source of the evidence.

Figure 10b. Database of events in Google Spreadsheet. The fields recorded are as follows: event name, start date and time, end date and time, the category or sub-plot that the event belongs to, the location of the event, and miscellaneous notes.

 

During this stage, we used Excel to create some experimental visualizations in order to get a preliminary sense of the overarching story. For instance, by sorting the import permits by category and applying conditional formatting, we noticed that Abu Hassan and Global Ways traded exclusively with each other [Fig.11a]. Following up on Hassan’s import permits revealed that his shipments frequently fluctuated in quantity [Fig.11b]. We also attempted to create a geo-temporal visualization showing the locations of important characters over time [Fig. 11c].

 

Figure 11a. Conditional formatting reveals that Abu Hassan and Global Ways trade exclusively with each other.

Figure 11b. An experimental visualization of Abu Hassan’s import permits across time. An unusual peak in the September 2004 permit is evident.

Figure 11c. An experimental visualization that shows location of characters across time. Time and location correspond to the X- and the Y-axis, respectively. Every colored point indicates the known location of a character at a given time. Rapper r’Bear, depicted in orange, is often in Shravaana, but attends Luella Vedric’s 8th Annual SPOMA dinner in New York in January of 2004.
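
The exclusive trading pattern that conditional formatting made visible in Figure 11a can also be checked programmatically. The Python sketch below assumes each permit has been reduced to an (importer, exporter) pair; that reduction and the function name are assumptions, and the actual column layout of the import permits file may differ.

```python
from collections import defaultdict


def exclusive_pairs(permits):
    """Return sorted (importer, exporter) pairs that trade only with each other.

    `permits` is an iterable of (importer, exporter) tuples, one per permit.
    """
    exporters_of = defaultdict(set)
    importers_of = defaultdict(set)
    for importer, exporter in permits:
        exporters_of[importer].add(exporter)
        importers_of[exporter].add(importer)
    pairs = []
    for importer, exporters in exporters_of.items():
        if len(exporters) == 1:
            (exporter,) = exporters
            # Exclusive only if the exporter also has no other customer.
            if importers_of[exporter] == {importer}:
                pairs.append((importer, exporter))
    return sorted(pairs)
```

With placeholder data such as `[("Importer A", "Exporter X"), ("Importer B", "Exporter Y"), ("Importer B", "Exporter Z")]`, only the first pair is reported, since Importer B deals with two exporters.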

 

2. Schematization

In this stage, we converted the pooled information into various diagrams, in an attempt to make sense of what may be happening. In particular, several network diagrams were created to visualize the relations between entities. Emphasis was placed on finding patterns and coherent connections in the data.

GraphViz was used to generate network diagrams automatically, using the Google Spreadsheets database created in the previous stage [Fig.12].  The colors of the links in the generated network diagram represented the strength of the connections between entities. The strength was based on our own subjective ratings on a scale from one to five. In Figure 12, we observe that the relations entered into our database were mostly supported by strong evidence, and were rarely speculative. The diagram also enabled us to distinguish between uni- and bi-directional relations, and if unidirectional, helped us determine the direction.  Finally, the diagram clustered related entities in close proximity, enabling a rapid visual assessment of whether two entities had any relation.  The clustering was also useful in determining the degrees of separation between entities.

 

Figure 12. GraphViz. Strengths of associations are coded as color of links.
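
A minimal sketch of how tab-delimited relation rows can be turned into GraphViz input is given below. It assumes the first four columns follow the layout in Figure 10a (object A, object B, strength from one to five, description); the color palette and function names are hypothetical, and this is not the exact script used to produce Figure 12.

```python
import csv
import io

# Hypothetical palette: strength 1 (speculative) to 5 (strong evidence).
STRENGTH_COLORS = {1: "#cccccc", 2: "#999999", 3: "#f0a000", 4: "#d04000", 5: "#a00000"}


def quote(label):
    """Wrap a string in double quotes so it is a valid DOT identifier."""
    return '"' + label.replace('"', '\\"') + '"'


def relations_to_dot(tsv_text):
    """Convert tab-delimited rows (A, B, strength, description, ...) to DOT."""
    lines = ["digraph relations {"]
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or incomplete rows
        a, b, strength, description = row[:4]
        color = STRENGTH_COLORS[int(strength)]
        lines.append(
            f"  {quote(a)} -> {quote(b)} [label={quote(description)}, color={quote(color)}];"
        )
    lines.append("}")
    return "\n".join(lines)
```

Saving the returned text as relations.dot and running `dot -Tpng relations.dot -o relations.png` renders the diagram; the layout engine then places connected entities near one another.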

 

In order to understand temporal relationships between events, we used the trial version of Timeline Maker [Fig.13]. One of its most useful features was the ability to color-code events by category, allowing us to easily understand the unfolding of related events.

 

Figure 13. Timeline Maker. Events are color-coded by category.

 

CmapTools was another tool used to generate relationship diagrams automatically from a database [Fig.14a]; it also allows nodes to be repositioned manually for organization. It was also used to create a map of motivations, which fueled hypothesis generation by outlining the critical facts and the inferences we could draw from them [Fig.14b]. This diagram was used as a basis for analyzing the loose ends of our stories, and was integral to presenting competing hypotheses.

 

Figure 14a. Diagram of entity relations generated with CmapTools.

Figure 14b. Motivations map created using CmapTools.

 

During this stage, we continued to use the cluster of sticky notes [Fig.8a,b], as well as the paper timeline [Fig.9a,b]. Although they were “low tech,” these representations were powerful in facilitating communication between analysts, providing a personal, shared space where information was exchanged through speech, pointing, gestures, and even body language.

3. Argumentation and Schema Shifting

Even though we shared a common database and set of visualizations, we found that forming a coherent and unified hypothesis was not a simple process. Many scenarios were plausible, each supported in some way by the evidence we had gathered.

We tackled this issue of competing hypotheses in two ways. First, we held a day of hypothesis presentations. We took turns explaining our personal views on the solution to the contest problem, highlighting both the supporting evidence and the holes in our reasoning. Second, we attempted an analysis of competing hypotheses, loosely following the model proposed by Richard J. Heuer, Jr. [Fig.15]. Through this exercise, we were able to gauge the dependency of each hypothesis on different pieces of information.

 

Figure 15. Google Docs discussion on whether or not multiple subplots exist.
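
The core bookkeeping of an analysis of competing hypotheses can be sketched as a small matrix computation. The Python example below is a simplified illustration, not the procedure we followed in Google Docs: it assumes each piece of evidence is rated against each hypothesis as consistent ("C"), inconsistent ("I"), or neutral ("N"), and all names are hypothetical.

```python
def inconsistency_scores(matrix):
    """Count, per hypothesis, the pieces of evidence rated inconsistent.

    `matrix` maps evidence -> {hypothesis: "C" | "I" | "N"}.
    A lower score means less evidence argues against the hypothesis.
    """
    scores = {}
    for ratings in matrix.values():
        for hypothesis, rating in ratings.items():
            scores[hypothesis] = scores.get(hypothesis, 0) + int(rating == "I")
    return scores


def diagnostic_evidence(matrix):
    """List evidence whose rating differs across hypotheses.

    Evidence rated identically for every hypothesis cannot help
    choose between them, however relevant it seems.
    """
    return [e for e, ratings in matrix.items() if len(set(ratings.values())) > 1]
```

Listing the diagnostic evidence separately shows which pieces of information a ranking actually depends on: if such a piece turns out to be wrong, the preferred hypothesis may change.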

 

The map of motivations created during the schematization stage continued to expand in this stage [Fig.14b], and helped identify the emergence of loose ends and unanswered questions. From this diagram, we were then able to individually refine our hypotheses, avoiding explanatory gaps or unsupportable jumps in causality.

4. Decision-Making

The final step was generating a “skeleton hypothesis” that included only facts and inferences we felt confident about, leaving out weak links and speculations. The wilder stories of global intrigue, while excellent at explaining the motivations of characters, lacked the strength of concrete evidence. The skeleton hypothesis was later used as a template for the final answer we submitted.

Collaboration and visualization software discussed in the previous stages of analysis was revisited during the decision-making process. Visualizations created using CmapTools, for instance, were modified and transformed to focus on different aspects of different groups of characters. Much of the decision-making near the end of the analysis happened face-to-face, and previously created computer-based visualizations facilitated the consensus-making process that ultimately led to our proposed solution.

 

 
